Large vocabulary continuous speech recognition of read speech over cellular and landline networks

Authors

  • Ashwin Rao
  • Bob Roth
  • Venkatesh Nagesha
  • Don McAllaster
  • Natalie Liberman
  • Larry Gillick
Abstract

We report results of large vocabulary continuous speech recognition (LVCSR) experiments conducted using speech data read over cellular and landline phones. Specifically, we compare (using stereo recordings) the speaker-independent and speaker-adapted recognition word error rates (WERs) measured over cellular and landline networks with those measured using a close-talking noise-canceling headset microphone, which serves as a baseline. The test set consists of speech data recorded by 25 speakers, each of whom provides test and adaptation data. We use acoustic models trained from relatively high-quality training data and an interpolated trigram language model. Some insights into the relative degradation in WERs over telephone networks are also provided by examining the recognition error rates for bandlimited and coded microphone speech.
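
The WERs compared in the abstract are conventionally defined as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal Python sketch of that standard definition, assuming whitespace tokenization and unit cost for substitutions, deletions, and insertions. The function name `word_error_rate` is illustrative; the abstract does not describe the scoring tool the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumptions (not from the paper): whitespace tokenization and
    unit cost for substitutions, deletions, and insertions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Because insertions count as errors, a WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.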


Similar resources

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...


Cross-domain robust acoustic training

This paper describes our efforts towards cross-domain acoustic training for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We used weighted multi-style training by pooling insufficient telephony landline and cellular data with downsampled wideband clean data to develop better hybrid acoustic models. We explored the effects on decision tree size to accuracy by approximately 10...


Audio-visual large vocabulary continuous speech recognition in the broadcast domain

We consider the problem of combining visual cues with audio signals for the purpose of improved automatic machine recognition of speech. Although significant progress has been made in machine transcription of large vocabulary continuous speech (LVCSR) over the last few years, the technology to date is most effective only under controlled conditions such as low noise, speaker dependent recognition...


Codebook Dependent Dynami for Mandarin Speech Recogn

Automatic speech recognition in telecommunications environments still has lower accuracy than its desktop counterparts. Improving the performance of telephone-quality speech recognition is an urgent problem for its application in those practical fields. Previous work has shown that the main reason for this performance degradation is the variational mismatch caused by different telephone ...


An auditory system-based feature for robust speech recognition

An auditory feature extraction algorithm for robust speech recognition in adverse acoustic environments is presented. The feature computation comprises an outer-middle-ear transfer function, FFT, frequency conversion from linear to the Bark scale, auditory filtering, nonlinearity, and discrete cosine transform. The feature is evaluated in two tasks: connected-digit recognition and large v...




Publication date: 2000